29 research outputs found

    Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation

    In this work, we evaluate 10 open-source instructed LLMs on four representative code comprehension and generation tasks. We have the following main findings. First, for the zero-shot setting, instructed LLMs are very competitive on code comprehension and generation tasks and sometimes even better than small SOTA models specifically fine-tuned on each downstream task. We also find that larger instructed LLMs are not always better on code-related tasks. Second, for the few-shot setting, we find that adding demonstration examples substantially helps instructed LLMs perform better on most code comprehension and generation tasks; however, the examples sometimes induce unstable or even worse performance. Furthermore, we find that the widely used BM25-based shot selection strategy significantly outperforms basic random selection or fixed selection only on generation problems. Third, for the fine-tuning setting, we find that fine-tuning could further improve the model performance on downstream code comprehension and generation tasks compared to the zero-shot/one-shot performance. In addition, after being fine-tuned on the same downstream task dataset, instructed LLMs outperform both the small SOTA models and similar-scaled LLMs without instruction tuning. Based on our findings, we further present practical implications on model and usage recommendation, performance and cost trade-offs, and future directions.
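
The BM25-based shot selection mentioned in this abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization, a non-negative IDF variant, and the common defaults `k1=1.5`, `b=0.75`; the function names and the example pool are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every tokenized document in corpus_tokens against the query."""
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in corpus_tokens:
        df.update(set(doc))
    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            # Non-negative IDF variant (an assumption of this sketch).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def select_shots(query, pool, k=2):
    """Return the k pool examples most similar to the query under BM25."""
    tokenized = [example.lower().split() for example in pool]
    scores = bm25_scores(query.lower().split(), tokenized)
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return [pool[i] for i in ranked[:k]]


# Hypothetical demonstration pool and query.
pool = [
    "sort a list of integers",
    "read a file line by line",
    "reverse a linked list",
]
best = select_shots("sort a list of strings", pool, k=1)
```

Random or fixed selection would ignore the query entirely; the sketch shows only how a lexical-similarity ranking picks demonstrations per query.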

    Recommending Analogical APIs via Knowledge Graph Embedding

    Library migration, which re-implements the same software behavior by using a different library instead of using the current one, has been widely observed in software evolution. One essential part of library migration is to find an analogical API that could provide the same functionality as current ones. However, given the large number of libraries/APIs, manually finding an analogical API could be very time-consuming and error-prone. Researchers have developed multiple automated analogical API recommendation techniques. Documentation-based methods have particularly attracted significant interest. Despite their potential, these methods have limitations, such as a lack of comprehensive semantic understanding in documentation and scalability challenges. In this work, we propose KGE4AR, a novel documentation-based approach that leverages knowledge graph (KG) embedding to recommend analogical APIs during library migration. Specifically, KGE4AR proposes a novel unified API KG to comprehensively and structurally represent three types of knowledge in documentation, which can better capture the high-level semantics. Moreover, KGE4AR then proposes to embed the unified API KG into vectors, enabling more effective and scalable similarity calculation. We build KGE4AR's unified API KG for 35,773 Java libraries and assess it in two API recommendation scenarios: with and without target libraries. Our results show that KGE4AR substantially outperforms state-of-the-art documentation-based techniques in both evaluation scenarios in terms of all metrics (e.g., 47.1%-143.0% and 11.7%-80.6% MRR improvements in each scenario). Additionally, we explore KGE4AR's scalability, confirming its effective scaling with the growing number of libraries. Comment: Accepted by FSE 202
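
To make the embedding-based similarity calculation and the MRR metric concrete, here is a small sketch. It assumes that each API is already represented by a vector and that candidates are ranked by cosine similarity; the vectors, API names, and function names are invented for illustration and do not reproduce KGE4AR itself.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(query_vec, candidates):
    """Order candidate API names by descending similarity to the query vector."""
    return sorted(
        candidates,
        key=lambda name: cosine(query_vec, candidates[name]),
        reverse=True,
    )


def mean_reciprocal_rank(rankings, gold):
    """Average of 1/rank of the correct answer; 0 when it is not in the ranking."""
    total = 0.0
    for ranking, target in zip(rankings, gold):
        for rank, name in enumerate(ranking, start=1):
            if name == target:
                total += 1.0 / rank
                break
    return total / len(gold)


# Hypothetical 2-dimensional embeddings of three candidate APIs.
candidates = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
ranking = rank_candidates([1.0, 0.1], candidates)
mrr = mean_reciprocal_rank([ranking], ["C"])
```

Here the correct answer `C` is ranked second, so the reciprocal rank for this single query is 1/2.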

    ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation

    In this work, we make the first attempt to evaluate LLMs in a more challenging code generation scenario, i.e. class-level code generation. We first manually construct the first class-level code generation benchmark ClassEval of 100 class-level Python code generation tasks with approximately 500 person-hours. Based on it, we then perform the first study of 11 state-of-the-art LLMs on class-level code generation. Based on our results, we have the following main findings. First, we find that all existing LLMs perform much worse on class-level code generation than on standalone method-level code generation benchmarks like HumanEval; and the method-level coding ability cannot equivalently reflect the class-level coding ability among LLMs. Second, we find that GPT-4 and GPT-3.5 still exhibit dominant superiority over other LLMs on class-level code generation, and the second-tier models include Instruct-Starcoder, Instruct-Codegen, and Wizardcoder with very similar performance. Third, we find that generating the entire class all at once (i.e. holistic generation strategy) is the best generation strategy only for GPT-4 and GPT-3.5, while method-by-method generation strategies (i.e. incremental and compositional) are better for the other models with limited ability of understanding long instructions and utilizing the middle information. Lastly, we find the limited model ability of generating method-dependent code and discuss the frequent error types in generated classes. Our benchmark is available at https://github.com/FudanSELab/ClassEval

    Evaluating and Improving Unified Debugging

    Automated debugging techniques, including fault localization and program repair, have been studied for over a decade. However, the only existing connection between fault localization and program repair is that fault localization computes the potential buggy elements for program repair to patch. Recently, a pioneering work, ProFL, explored the idea of unified debugging to unify fault localization and program repair in the other direction for the first time to boost both areas. In this way, ProFL also extends the application scope of automated repair to all possible bugs (not only the small ratio of bugs that repair systems can automatically fix). However, ProFL only considers one program repair system, and it is not clear how other repair systems contribute to unified debugging. In this work, we perform an extensive study of the unified debugging approach on 16 state-of-the-art program repair systems for the first time. Our initial experimental results on the Defects4J benchmark reveal various practical guidelines for unified debugging, such as (1) nearly all 16 studied repair systems positively contribute to unified debugging despite their varying repair capabilities, (2) repair systems targeting multi-edit patches can bring extraneous noise, (3) repair systems with more executed/plausible patches tend to perform better, (4) unified debugging effectiveness does not rely on the availability of correct patches, and (5) we propose a new technique, UniDebug++, which localizes over 20% more bugs within Top-1 than state-of-the-art technique ProFL. 
Furthermore, we extend the above experiments to make the following additional contributions: we (6) further perform an extensive study on 76.3% additional bugs and confirm that UniDebug++ again outperforms ProFL by localizing 185 (out of 395) bugs within Top-1, (7) investigate the impact of 33 SBFL formulae and observe UniDebug++ consistently improving upon all formulae, (8) demonstrate that UniDebug++ can substantially boost state-of-the-art learning-based method-level fault localization techniques, (9) extend unified debugging to the statement level for the first time and observe that UniDebug++ localizes 78 (out of 395) bugs within Top-1 and outperforms state-of-the-art learning-based fault localization techniques by 30%, and finally (10) propose a new technique, UniDebug+*, based on detailed patch statistics, to further improve upon UniDebug++.
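
As background for the SBFL formulae mentioned above, the sketch below computes Ochiai, one widely used spectrum-based formula. The abstract does not list which 33 formulae were studied, so the choice of Ochiai is an assumption, and the method names and coverage counts are invented. The sketch covers only the SBFL baseline ranking, not the patch-based refinement performed by ProFL or UniDebug++.

```python
import math


def ochiai(failed_cover, passed_cover, total_failed):
    """Ochiai suspiciousness: ef / sqrt(total_failed * (ef + ep))."""
    denom = math.sqrt(total_failed * (failed_cover + passed_cover))
    return failed_cover / denom if denom else 0.0


def rank_by_suspiciousness(spectra, total_failed):
    """Order program elements from most to least suspicious.

    spectra maps an element name to (ef, ep): the number of failing and
    passing tests that execute that element.
    """
    scores = {
        name: ochiai(ef, ep, total_failed) for name, (ef, ep) in spectra.items()
    }
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical coverage spectra for three methods, with 2 failing tests overall.
spectra = {"m1": (2, 1), "m2": (1, 3), "m3": (0, 4)}
ranking, scores = rank_by_suspiciousness(spectra, total_failed=2)
```

A bug counts as localized within Top-1 when the faulty element is first in such a ranking.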

    Development and Hybrid Position/Force Control of a Dual-Drive Macro-Fiber-Composite Microgripper

    This paper reports on the development, implementation and hybrid control of a new macro-fiber-composite microgripper with synchronous position and force control capabilities. In particular, the macro-fiber-composite actuator was composed of rectangular piezoelectric fibers covered by interdigitated electrodes and embedded in structural epoxy. Thus, the macro-fiber-composite microgripper had a larger displacement-volume ratio (i.e., the ratio of the output displacement to the volume of the microgripper) than that of a traditional piezoelectric one. Moreover, to regulate both the gripper position and the gripping force simultaneously, a hybrid position/force control scheme using fuzzy sliding mode control and a proportional-integral controller was developed. In particular, the fuzzy sliding mode control was used to achieve precise position control under the influence of system disturbances and uncertainties, and the proportional-integral controller was used to guarantee the force control accuracy of the microgripper. A series of experimental investigations was performed to verify the feasibility of the developed microgripper and the control scheme. The experimental results validated the effectiveness of the designed microgripper and hybrid control scheme. The developed microgripper was capable of precision and multiscale micromanipulation tasks.
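
The proportional-integral force loop described above can be sketched as a discrete-time PI controller. Everything numeric here is an assumption for illustration: the gains, the time step, and the first-order plant standing in for the gripping-force dynamics are not taken from the paper, and the fuzzy sliding mode position loop is not modeled.

```python
class PIController:
    """Discrete-time PI controller using forward-Euler integration of the error."""

    def __init__(self, kp, ki, dt):
        self.kp = kp
        self.ki = ki
        self.dt = dt
        self.integral = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        return self.kp * error + self.ki * self.integral


def simulate_force_loop(setpoint=1.0, tau=0.1, dt=0.001, duration=5.0):
    """Drive a hypothetical first-order plant, tau * dy/dt = -y + u, to the setpoint."""
    controller = PIController(kp=2.0, ki=5.0, dt=dt)
    y = 0.0
    for _ in range(int(duration / dt)):
        u = controller.update(setpoint, y)
        y += dt * (-y + u) / tau
    return y


final_force = simulate_force_loop()
```

The integral term is what removes the steady-state error that a purely proportional loop would leave on this plant.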

    Experimental Identification and Vibration Control of A Piezoelectric Flexible Manipulator Using Optimal Multi-Poles Placement Control

    This paper presents experimental identification and vibration suppression of a flexible manipulator with piezoelectric actuators and strain sensors using optimal multi-poles placement control. To precisely identify the system model, a reduced order transfer function with relocated zeros is proposed, and a first-order inertia element is added to the model. Comparisons show the identified model matches closely with the experimental results in both the time and frequency domains, and a fit of 97.2% is achieved. Based on the identified model, a full-state multi-poles placement controller is designed, and the optimal locations of the closed loop poles are determined where the move distance of the closed loop poles is the shortest. The feasibility of the proposed controller is validated by simulations. Moreover, the controller is tested for different locations of the closed loop poles, and an excellent performance of the optimal locations of the closed loop poles is shown. Finally, the effectiveness of the proposed controller is demonstrated by experiments. Results show that the vibrations of the expected modes are significantly diminished. Accordingly, multi-mode vibrations of the manipulator are well attenuated.
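
For readers unfamiliar with full-state pole placement, the sketch below works through the textbook case of a single second-order mode in controllable canonical form. It is not the paper's optimal multi-poles method (which also chooses where to put the poles); the plant coefficients and desired poles are hypothetical.

```python
def place_poles_second_order(a0, a1, p1, p2):
    """State-feedback gains for x' = [[0, 1], [-a0, -a1]] x + [0, 1]^T u, u = -K x.

    With K = [k1, k2] the closed-loop characteristic polynomial is
    s^2 + (a1 + k2) s + (a0 + k1). Matching it to the desired polynomial
    (s - p1)(s - p2) = s^2 - (p1 + p2) s + p1 * p2 gives the gains below.
    p1 and p2 must be real or a complex-conjugate pair.
    """
    pole_sum = complex(p1) + complex(p2)
    pole_product = complex(p1) * complex(p2)
    if abs(pole_sum.imag) > 1e-9 or abs(pole_product.imag) > 1e-9:
        raise ValueError("poles must be real or a complex-conjugate pair")
    k1 = pole_product.real - a0
    k2 = -pole_sum.real - a1
    return k1, k2


# Hypothetical lightly damped mode: natural frequency 2 rad/s, damping ratio 0.01.
a0, a1 = 4.0, 0.04
# Desired closed-loop poles with far more damping.
k1, k2 = place_poles_second_order(a0, a1, complex(-1, 2), complex(-1, -2))
```

Moving the poles from roughly -0.02 ± 2j to -1 ± 2j is what increases the damping of the mode; a shortest-move-distance criterion like the one in the abstract would additionally penalize how far the poles are shifted.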
